Abstract. A well-studied class of functions in communication complexity are composed
functions of the form $(f \circ g^n)(x,y) = f(g(x^1,y^1),\ldots,g(x^n,y^n))$. This is a rich
family of functions which encompasses many of the important examples in the literature.
It is thus of great interest to understand what properties of $f$ and $g$ affect the
communication complexity of $f \circ g^n$, and in what way.

Recently, Sherstov [She09] and independently Shi-Zhu [SZ09b] developed conditions on
the inner function $g$ which imply that the quantum communication complexity of $f \circ g^n$
is at least the approximate polynomial degree of $f$. We generalize both of these frameworks.
We show that the pattern matrix framework of Sherstov works whenever the
inner function $g$ is strongly balanced: we say that $g : X \times Y \to \{-1,+1\}$ is strongly
balanced if all rows and columns in the matrix $M_g = [g(x,y)]_{x,y}$ sum to zero. This result
strictly generalizes the pattern matrix framework of Sherstov [She09], which has been
a very useful idea in a variety of settings [She08b,RS08,Cha07,LS09a,CA08,BHN09].
Shi-Zhu require that the inner function $g$ has small spectral discrepancy, a somewhat
awkward condition to verify. We relax this to the usual notion of discrepancy.
We also enhance the framework of composed functions studied so far by considering
functions $F(x,y) = f(g(x,y))$, where the range of $g$ is a group $G$. When $G$ is Abelian,
the analogue of the strongly balanced condition becomes a simple group invariance
property of $g$. We are able to formulate a general lower bound on $F$ whenever $g$ satisfies
this property.



1 Introduction 

Communication complexity studies the minimum amount of communication needed to com- 
pute a function whose input variables are distributed between two or more parties. Since the 
introduction by Yao [Yao79] of an elegant mathematical model to study this question, com- 
munication complexity has grown into a rich field both because of its inherent mathematical 
interest and also its application to many other models of computation. See the textbook of 
Kushilevitz and Nisan [KN97] for a comprehensive introduction to the field. 

In analogy with traditional computational complexity classes, one can consider different 
models of communication complexity based on the resources available to the parties. Besides 
the standard deterministic model, of greatest interest to us will be a randomized version of 
communication complexity, where the parties have access to a source of randomness and are 
allowed to err with some small constant probability, and a quantum model where the parties 
share a quantum channel and the cost is measured in qubits. 

Several major open questions in communication complexity ask about how different
complexity measures relate to each other. The log rank conjecture, formulated by Lovász
and Saks [LS88], asks if the deterministic communication complexity of a Boolean function
$F : X \times Y \to \{0,1\}$ is upper bounded by a polynomial in the logarithm of the rank of the
matrix $[F(x,y)]_{x,y}$. Another major open question is whether randomized and quantum
communication complexity are polynomially related for all total functions. We should mention
here that the assumption of the function being total is crucial, as an exponential separation
is known for a partial function [Raz99].



One approach to these questions has been to study them for restricted classes of functions.
Many functions of interest are block composed functions. For finite sets $X$, $Y$, and $E$, a
function $f : E^n \to \{-1,+1\}$, and a function $g : X \times Y \to E$, the block composition of $f$
and $g$ is the function $f \circ g^n : X^n \times Y^n \to \{-1,+1\}$ defined by
$(f \circ g^n)(x,y) = f(g(x^1,y^1),\ldots,g(x^n,y^n))$, where $(x^i,y^i) \in X \times Y$ for all
$i = 1,\ldots,n$. For example, if $E = \{-1,+1\}$, the inner product
function results when $f$ is PARITY and $g$ is AND, set-intersection when $f$ is OR and $g$ is
AND, and the equality function when $f$ is AND and $g$ is the function IS-EQUAL, which is
one if and only if $x = y$.
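
To make the definition concrete, the following minimal Python sketch builds the sign matrix of a block composed function by brute force and checks the inner product example. The helper names (`block_compose`, `parity`, `and_sign`) and the convention that $-1$ encodes "true" are illustrative assumptions, not notation from the paper.

```python
from itertools import product

def block_compose(f, g, X, Y, n):
    """Return (rows, cols, matrix) with matrix[a][b] = f(g(x^1,y^1), ..., g(x^n,y^n)).

    rows and cols enumerate X^n and Y^n; f takes a tuple of n values of g.
    """
    rows = list(product(X, repeat=n))
    cols = list(product(Y, repeat=n))
    matrix = [[f(tuple(g(xi, yi) for xi, yi in zip(x, y))) for y in cols]
              for x in rows]
    return rows, cols, matrix

def parity(z):
    # PARITY in sign notation: the product of the +/-1 inputs.
    sign = 1
    for zi in z:
        sign *= zi
    return sign

def and_sign(x, y):
    # AND of two bits, with -1 encoding "true" (an assumed convention).
    return -1 if (x == 1 and y == 1) else 1

# PARITY composed with AND on 3 blocks is the inner product function.
rows, cols, ip_matrix = block_compose(parity, and_sign, (0, 1), (0, 1), 3)
```

Under these conventions each entry of `ip_matrix` equals $(-1)^{\langle x, y \rangle}$; swapping `parity` for an OR in sign notation would give set-intersection in the same way.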

In a seminal paper, Razborov [Raz03] gave tight bounds for the bounded-error quantum
communication complexity of block composed functions where the outer function $f$ is
symmetric and the inner function $g$ is bitwise AND. In particular, this result showed that
randomized and quantum communication complexity are polynomially related for such
functions.

More recently, very nice frameworks have been developed by Sherstov [She07,She09] and
independently by Shi and Zhu [SZ09b] to bound the quantum complexity of block composed
functions; these go beyond the case of symmetric $f$ to work for any $f$ provided the inner
function $g$ satisfies certain technical conditions. When $g$ satisfies these conditions, these
frameworks allow one to lower bound the quantum communication complexity of $f \circ g^n$ in terms
of the approximate polynomial degree of $f$, a classically well-studied measure. Shi and Zhu
are able to get a bound on $f \circ g^n$ in terms of the approximate degree of $f$ whenever $g$ is
sufficiently "hard". Unfortunately, the hardness condition they need is in terms of "spectral
discrepancy," a quantity which is somewhat difficult to bound, and their bound requires that
$g$ is a function on at least $\Omega(\log(n/d))$ bits, where $d$ is the approximate polynomial degree
of $f$. Because of this, Shi-Zhu are only able to reproduce Razborov's results with a
polynomially weaker bound.

Sherstov developed so-called pattern matrices, which are the matrix representation of a
block composed function when $g$ is a fixed function of a particularly nice form. Namely,
in a pattern matrix the inner function $g : \{-1,+1\}^k \times ([k] \times \{-1,+1\}) \to \{-1,+1\}$ is
parameterized by a positive integer $k$ and defined by $g(x,(i,b)) = x_i \cdot b$, where $x_i$ denotes
the $i$th bit of $x$. In other words, the first argument of $g$ is a $k$-bit string $x$, and the second
argument selects a bit of $x$ or its negation. So here $X = \{-1,+1\}^k$, $Y = [k] \times \{-1,+1\}$, and the
intermediate set $E$ is $\{-1,+1\}$. With this $g$, Sherstov shows that the approximate polynomial
degree of $f$ is a lower bound on the quantum communication complexity of $f \circ g^n$, for any
function $f$. Though seemingly quite special, pattern matrices have proven to be an extremely
useful concept. First, they give a simple proof of Razborov's tight lower bounds for $f(x \wedge y)$
for symmetric $f$. Second, they have also found many other applications in unbounded-error
communication complexity [She08b,RS08] and have been successfully extended to multiparty
communication complexity [Cha07,LS09a,CA08,BHN09].

A key step in both the works of Sherstov and Shi-Zhu is to bound the spectral norm of
a sum of matrices $\|\sum_i B_i\|$. This is the major step where these works differ. Shi-Zhu apply
the triangle inequality to bound this as $\|\sum_i B_i\| \le \sum_i \|B_i\|$. On the other hand, Sherstov
observes that in the case of pattern matrices the terms of this sum are mutually orthogonal,
i.e. $B_i^\dagger B_j = B_i B_j^\dagger = 0$ for all $i \neq j$. In this case, one has a stronger bound on the spectral
norm: $\|\sum_i B_i\| = \max_i \|B_i\|$.

In this paper, we extend both of the frameworks of Sherstov and Shi-Zhu. In the case of 
Shi-Zhu, we are able to reprove their theorem with the usual notion of discrepancy instead 
of the somewhat awkward spectral discrepancy they use. The main observation we make is 
that as all Shi-Zhu use in this step is the triangle inequality, we can repeat the argument 
with any norm here, including discrepancy itself. 

In the case of pattern matrices, special properties of the spectral norm are used, namely
the fact about the spectral norm of a sum of orthogonal matrices. We step back to see what
key features of a pattern matrix lead to this orthogonality property. We begin with the
Boolean case, that is, where the intermediate set $E$ is taken to be $\{-1,+1\}$. In this case, a
crucial concept is the notion of a strongly balanced function. We say that $g : X \times Y \to \{-1,+1\}$
is strongly balanced if in the sign matrix $M_g[x,y] = g(x,y)$ all rows and all columns sum to
zero. We show that whenever the inner function $g$ is strongly balanced, the key orthogonality
condition holds; this implies that whenever $g$ is strongly balanced and the communication
matrix of $g$ has rank larger than one, the approximate degree of the outer function $f$ is a
lower bound on the quantum communication complexity of $f \circ g^n$.
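
As a quick sanity check of the claim that pattern matrices fall under this condition, the sketch below builds the sign matrix of the pattern-matrix inner function $g(x,(i,b)) = x_i \cdot b$ for small $k$ and tests whether every row and column sums to zero. The function names and the zero-based indexing of $i$ are assumptions made for illustration.

```python
from itertools import product

def pattern_inner_matrix(k):
    """Sign matrix of g(x, (i, b)) = x_i * b, x in {-1,+1}^k, i in range(k), b in {-1,+1}."""
    rows = list(product((-1, 1), repeat=k))
    cols = [(i, b) for i in range(k) for b in (-1, 1)]
    return [[x[i] * b for (i, b) in cols] for x in rows]

def is_strongly_balanced(matrix):
    """True if every row and every column of the sign matrix sums to zero."""
    rows_ok = all(sum(row) == 0 for row in matrix)
    cols_ok = all(sum(col) == 0 for col in zip(*matrix))
    return rows_ok and cols_ok

pattern_ok = all(is_strongly_balanced(pattern_inner_matrix(k)) for k in range(1, 5))
and_matrix = [[1, 1], [1, -1]]          # AND on one bit, -1 encoding "true"
and_ok = is_strongly_balanced(and_matrix)
```

Here `pattern_ok` holds because each column pair $(i,+1),(i,-1)$ cancels within a row and each bit $x_i$ is $+1$ on exactly half of the rows, while the one-bit AND matrix is not even balanced.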

The requirement that the communication matrix of $g$ has rank larger than one is necessary
for such a statement. For example, when $g(x,y) = \oplus(x,y)$ is the XOR function on one bit,
then the communication complexity of PARITY $\circ\, g^n$ is constant, while PARITY has linear
approximate polynomial degree. It turns out that when $g$ is rank-one, the appropriate measure
of the complexity of $f \circ g^n$ is no longer the approximate degree of $f$, but the minimum $\ell_1$
norm of Fourier coefficients of a function entrywise close to $f$; see the survey [LS09b] for a
description of this case.
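
The XOR example can be checked directly. In sign notation (an assumed encoding) XOR is the product $x_i y_i$, so PARITY of the blockwise XORs factors as the parity of $x$ times the parity of $y$; the sketch below verifies that the resulting sign matrix is an outer product, hence rank one.

```python
from itertools import product
from math import prod

n = 3
inputs = list(product((-1, 1), repeat=n))

# Sign matrix of PARITY composed with one-bit XOR: entry is prod_i (x_i * y_i).
parity_xor = [[prod(xi * yi for xi, yi in zip(x, y)) for y in inputs]
              for x in inputs]

# Each party's contribution is the parity of its own input.
u = [prod(x) for x in inputs]

# The matrix equals the outer product u u^T, so it has rank one.
is_rank_one = all(parity_xor[a][b] == u[a] * u[b]
                  for a in range(len(inputs)) for b in range(len(inputs)))
```

This is why a constant number of bits suffices: each party only needs to communicate the parity of its own input.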

We also consider the general case where the intermediate set is any group $G$. That is,
we consider functions $F(x,y) = f(g(x,y))$, where $g : X \times Y \to G$ for a group $G$ and
$f : G \to \{-1,+1\}$ is a class function on $G$. The case $E = \{-1,+1\}$ discussed above corresponds
to taking the group $G = \mathbb{Z}_2^n$. When $G$ is a general Abelian group, the key orthogonality
condition requires more than that the matrix $M_g[x,y] = g(x,y)$ is strongly balanced; still,
it admits a nice characterization in terms of group invariance. A multiset $T \in G \times G$ is
said to be $G$-invariant if $(s,s)T = T$ for all $s \in G$. The orthogonality condition will hold
if and only if all pairs of rows and all pairs of columns of $M_g$ (when viewed as multisets)
are $G$-invariant. One can generalize the results discussed above to this general setting with
appropriate modifications. In the case that $G = \mathbb{Z}_2^n$, the $G$-invariant condition degenerates
to the strongly balanced requirement on $M_g$.

2 Preliminaries 

All logarithms are base two. For a complex number $z = a + ib$ we let $\bar{z} = a - ib$ denote the
complex conjugate of $z$, and we write $|z| = \sqrt{a^2 + b^2}$ and $\mathrm{Re}(z) = a$.

2.1 Complexity measures 

We will make use of several complexity measures of functions and matrices. Let
$f : \{-1,+1\}^n \to \{-1,+1\}$ be a function. For $T \subseteq [n]$, the Fourier coefficient of $f$
corresponding to the character $\chi_T$ is
$\hat{f}_T = \frac{1}{2^n}\sum_x f(x)\chi_T(x) = \frac{1}{2^n}\sum_x f(x)\prod_{i \in T} x_i$.
The degree of $f$ as a polynomial, denoted $\deg(f)$, is the size of a largest set $T$ for which
$\hat{f}_T \neq 0$.
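
For small $n$ these quantities can be computed by brute force. The sketch below is illustrative only: the function names are assumptions, subsets $T$ are represented as sorted tuples of zero-based indices, and OR is encoded with $-1$ as "true".

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

def fourier_coefficients(f, n):
    """Map each T (a tuple of indices) to (1/2^n) * sum_x f(x) * prod_{i in T} x_i."""
    points = list(product((-1, 1), repeat=n))
    coeffs = {}
    for size in range(n + 1):
        for T in combinations(range(n), size):
            total = sum(f(x) * prod(x[i] for i in T) for x in points)
            coeffs[T] = Fraction(total, 2 ** n)
    return coeffs

def degree(f, n):
    """Size of a largest T with a nonzero Fourier coefficient."""
    return max(len(T) for T, c in fourier_coefficients(f, n).items() if c != 0)

def parity(x):
    return prod(x)

def or_sign(x):
    # OR in sign notation, with -1 encoding "true" (an assumed convention).
    return -1 if any(xi == -1 for xi in x) else 1

parity_coeffs = fourier_coefficients(parity, 3)
or_coeffs = fourier_coefficients(or_sign, 2)
```

With these conventions PARITY on three bits has a single nonzero coefficient, at $T = \{0,1,2\}$, while OR on two bits has all four coefficients nonzero.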

We will need some notation for matrices. We reserve $J$ for the all ones matrix, whose
size will be determined by the context. For a matrix $A$ let $A^\dagger$ denote the conjugate transpose
of $A$. We use $A \bullet B$ for the entrywise product of $A, B$, and $A \otimes B$ for the tensor product. If
$A$ is an $m$-by-$n$ matrix then we say that $\mathrm{size}(A) = mn$. We use $\langle A, B \rangle = \mathrm{Tr}(AB^\dagger)$ for the
inner product of $A$ and $B$.

Let $\|A\|_1$ be the $\ell_1$ norm of $A$, i.e. the sum of the absolute values of the entries of $A$, and
$\|A\|_\infty$ the $\ell_\infty$ norm. For a positive semidefinite matrix $M$ let
$\lambda_1(M) \ge \cdots \ge \lambda_n(M) \ge 0$ be the eigenvalues of $M$. We define the $i$th singular value
of $A$, denoted $\sigma_i(A)$, as $\sigma_i(A) = \sqrt{\lambda_i(AA^\dagger)}$.
The rank of $A$, denoted $\mathrm{rk}(A)$, is the number of nonzero singular values of $A$. We will use
several matrix norms. The spectral or operator norm is the largest singular value
$\|A\| = \sigma_1(A)$, the trace norm is the sum of all singular values $\|A\|_{tr} = \sum_i \sigma_i(A)$, and the
Frobenius norm is the $\ell_2$ norm of the singular values $\|A\|_F = \sqrt{\sum_i \sigma_i(A)^2}$.



When $AB^\dagger = A^\dagger B = 0$ we will say that $A, B$ are orthogonal. Please note the difference
with the common use of this term, which usually means $\langle A, B \rangle = 0$. The following facts are
easily seen.



Fact 1. Let $A, B$ be two matrices of the same dimensions and suppose that $AB^\dagger = A^\dagger B = 0$.
Then

$$\mathrm{rk}(A + B) = \mathrm{rk}(A) + \mathrm{rk}(B), \quad \|A + B\|_{tr} = \|A\|_{tr} + \|B\|_{tr}, \quad \|A + B\| = \max\{\|A\|, \|B\|\}.$$

Another norm we will use is the $\gamma_2$ norm, introduced to complexity theory in [LMSS07],
and familiar in matrix analysis as the Schur product operator norm. The $\gamma_2$ norm can be
viewed as a weighted version of the trace norm.



Definition 1.

$$\gamma_2(A) = \max_{u,v : \|u\| = \|v\| = 1} \|A \bullet uv^\dagger\|_{tr}.$$

Here $A \bullet B$ denotes the entrywise product of $A$ and $B$. It is clear from this definition that
$\gamma_2(A) \ge \|A\|_{tr}/\sqrt{mn}$ for an $m$-by-$n$ matrix $A$.

For a norm $\Phi$, the dual norm $\Phi^*$ is defined as $\Phi^*(v) = \max_{u : \Phi(u) \le 1} |\langle u, v \rangle|$. For example,
the $\ell_\infty$ norm is dual to the $\ell_1$ norm, and the spectral norm is dual to the trace norm.

The norm $\gamma_2^*$, dual to the $\gamma_2$ norm, looks as follows.

Definition 2.

$$\gamma_2^*(A) = \max_{u_i, v_j : \|u_i\| = \|v_j\| = 1} \sum_{i,j} A[i,j]\, \langle u_i, v_j \rangle.$$

Another complexity measure we will make use of is discrepancy.

Definition 3. Let $A$ be an $m$-by-$n$ sign matrix and let $P$ be a probability distribution on the
entries of $A$. The discrepancy of $A$ with respect to $P$, denoted $\mathrm{disc}_P(A)$, is defined as

$$\mathrm{disc}_P(A) = \max_{x \in \{0,1\}^m,\; y \in \{0,1\}^n} |x^\dagger (A \bullet P)\, y|.$$
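
For very small matrices the definition can be evaluated exhaustively. The sketch below does this for the uniform distribution only; the function name is an assumption, and the running time is exponential in $m + n$, so it is meant purely as an illustration of the definition.

```python
from fractions import Fraction
from itertools import product

def discrepancy_uniform(A):
    """Brute-force discrepancy of sign matrix A under the uniform distribution.

    Maximizes |x^T (A . P) y| over 0/1 vectors x, y, where P has all entries 1/(m*n).
    """
    m, n = len(A), len(A[0])
    best = 0
    for x in product((0, 1), repeat=m):
        # Column sums restricted to the rows selected by x.
        col_sums = [sum(A[i][j] for i in range(m) if x[i]) for j in range(n)]
        for y in product((0, 1), repeat=n):
            value = abs(sum(c for c, yj in zip(col_sums, y) if yj))
            best = max(best, value)
    return Fraction(best, m * n)

hadamard = [[1, 1], [1, -1]]
all_ones = [[1, 1], [1, 1]]
disc_hadamard = discrepancy_uniform(hadamard)
disc_ones = discrepancy_uniform(all_ones)
```

For the 2-by-2 Hadamard matrix the best rectangle has entry sum 2, giving discrepancy $1/2$, whereas the all ones matrix attains the maximum possible value 1.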

We will write $\mathrm{disc}_U(A)$ for the special case where $P$ is the uniform distribution. It is easy
to see from this definition that $\mathrm{disc}_U(A) \le \frac{\|A\|}{\sqrt{\mathrm{size}(A)}}$. Shaltiel [Sha03] has shown the deeper
result that this bound is in fact polynomially tight:
Theorem 2 (Shaltiel). Let $A$ be a sign matrix. Then

$$\frac{1}{108}\left(\frac{\|A\|}{\sqrt{\mathrm{size}(A)}}\right)^3 \le \mathrm{disc}_U(A).$$

Discrepancy and the $\gamma_2^*$ norm are very closely related. Linial and Shraibman [LS09c]
observed that Grothendieck's inequality gives the following.

Theorem 3 (Linial-Shraibman). For any sign matrix $A$ and probability distribution $P$

$$\mathrm{disc}_P(A) \le \gamma_2^*(A \bullet P) \le K_G\, \mathrm{disc}_P(A)$$
where $1.67\ldots \le K_G \le 1.78\ldots$ is Grothendieck's constant.



Approximate measures. We will also use approximate versions of these complexity measures,
which come in handy when working with bounded-error models. Say that a function $g$
gives an $\epsilon$-approximation to $f$ if $|f(x) - g(x)| \le \epsilon$ for all $x \in \{-1,+1\}^n$. The $\epsilon$-approximate
polynomial degree of $f$, denoted $\deg_\epsilon(f)$, is the minimum degree of a function $g$ which gives
an $\epsilon$-approximation to $f$.

We will similarly look at the $\epsilon$-approximate version of the trace and $\gamma_2$ norms. We give
the general definition with respect to any norm.

Definition 4 (approximation norm). Let $\Phi : \mathbb{R}^n \to \mathbb{R}$ be an arbitrary norm. Let $v \in \mathbb{R}^n$
be a sign vector. For $0 \le \epsilon < 1$ we define the approximation norm $\Phi^\epsilon$ as

$$\Phi^\epsilon(v) = \min_{u : \|v - u\|_\infty \le \epsilon} \Phi(u).$$

Notice that an approximation norm $\Phi^\epsilon$ is not itself a norm: we have only defined it for sign
vectors, and it will in general not satisfy the triangle inequality.

As a norm is a convex function, using the separating hyperplane theorem one can quite 
generally give the following equivalent dual formulation of an approximation norm. 

Proposition 1. Let $v \in \mathbb{R}^n$ be a sign vector, and $0 \le \epsilon < 1$. Then

$$\Phi^\epsilon(v) = \max_u \frac{|\langle v, u \rangle| - \epsilon \|u\|_1}{\Phi^*(u)}.$$

A proof of this can be found in the survey [LS09b] . 



2.2 Communication complexity 

Let $X, Y, S$ be finite sets and $f : X \times Y \to S$ be a function. We will let $D(f)$ be the
deterministic communication complexity of $f$, and $R_\epsilon(f)$ denote the randomized public coin
complexity of $f$ with error probability at most $\epsilon$. We refer the reader to [KN97] for a
formal definition of these models. We will also study $Q_\epsilon(f)$ and $Q^*_\epsilon(f)$, the $\epsilon$-error quantum
communication complexity of $f$ without and with shared entanglement, respectively. We refer
the reader to [Raz03] for a nice description of these models.

For notational convenience, we will identify a function $f : X \times Y \to \{-1,+1\}$ with its
sign matrix $M_f = [f(x,y)]_{x,y}$. Thus, for example, $\|f\|$ refers to the spectral norm of the sign
matrix representation of $f$.

For all of our lower bound results we will actually lower bound the approximate trace
norm or $\gamma_2$ norm of the function. Razborov showed that the approximate trace norm can
be used to lower bound quantum communication complexity, and Linial and Shraibman
generalized this to the $\gamma_2$ norm.

Theorem 4 (Linial-Shraibman [LS09d]). Let $A$ be a sign matrix and $0 \le \epsilon < 1/2$. Then

$$Q^*_\epsilon(A) \ge \log\left(\gamma_2^{2\epsilon}(A)\right) - 2.$$

Composed functions. Before discussing lower bounds on a block composed function $f \circ g^n$,
let us see what we expect the complexity of such a function to be. A fundamental idea going
back to Nisan [Nis94] and Buhrman, Cleve, and Wigderson [BCW98] is that the complexity
of $f \circ g^n$ can be related to the decision tree complexity, also known as query complexity, of
$f$ and the communication complexity of $g$. Let $\mathrm{DT}(f)$ be the query complexity of $f$, that is
the number of queries of the form $x_i = ?$ needed to evaluate $f(x)$ in the worst case. Similarly,
let $\mathrm{RT}_\epsilon(f)$, $\mathrm{QT}_\epsilon(f)$ denote the randomized and quantum query complexity of $f$ respectively,
with error probability at most $\epsilon$. For formal definitions of these measures and a survey of
query complexity we recommend Buhrman and de Wolf [BW02].



Theorem 5 (Nisan [Nis94], Buhrman-Cleve-Wigderson [BCW98]). For any two
Boolean functions $f : \{-1,+1\}^n \to \{-1,+1\}$ and $g : X \times Y \to \{-1,+1\}$,

$$D(f \circ g^n) = O(\mathrm{DT}(f)\, D(g))$$
$$R_{1/4}(f \circ g^n) = O(\mathrm{RT}_{1/4}(f)\, R_{1/4}(g) \log \mathrm{RT}_{1/4}(f))$$
$$Q_{1/4}(f \circ g^n) = O(\mathrm{QT}_{1/4}(f)\, Q_{1/4}(g) \log n).$$

One advantage of working with block composed functions in light of this upper bound 
is that query complexity is in general better understood than communication complexity. 
In particular, a polynomial relationship between deterministic query complexity and degree, 
and randomized and quantum query complexities and approximate degree is known. 

Theorem 6 ([NS94,BBC+01]). Let $f : \{0,1\}^n \to \{-1,+1\}$. Then

$$\mathrm{DT}(f) = O(\deg(f)^4), \quad \mathrm{DT}(f) = O(\deg_{1/4}(f)^6).$$

Using this result together with Theorem 5 gives the following corollary:
Corollary 1.

$$D(f \circ g^n) = O(\deg(f)^4 D(g)), \quad R_{1/4}(f \circ g^n) = O(\deg_{1/4}(f)^6 R_{1/4}(g) \log \deg_{1/4}(f)).$$

Our goal, then, in showing lower bounds on the complexity of a block composed function
$f \circ g^n$ is to get something at least in the ballpark of this upper bound. Of course, this is
not always possible, as the protocol given by Theorem 5 is not always optimal. For example,
when $f$ is the PARITY function on $n$ bits, and $g(x,y) = \oplus(x,y)$, this protocol just gives an
upper bound of $n$ bits, when the true complexity is constant. See recent results by Zhang
[Zha09] and Sherstov [She10] for discussions on the tightness of the bounds in Theorem 5.

3 Rank of block composed functions 

We begin by analyzing the rank of a block composed function $f \circ g^n$ when the inner function
$g$ is strongly balanced. This case will illustrate the use of the strongly balanced assumption,
and is simpler to understand than the bounded-error situation treated in the next section.
Let us first formally state the definition of strongly balanced.

Definition 5 (strongly balanced). Let $A$ be a sign matrix, and $J$ be the all ones matrix
of the same dimensions as $A$. We say that $A$ is balanced if $\mathrm{Tr}(AJ^\dagger) = 0$. We further say
that $A$ is strongly balanced if $AJ^\dagger = A^\dagger J = 0$. In words, a sign matrix is strongly balanced
if the sum over each row is zero, and similarly the sum over each column is zero. We will
say that a two-variable Boolean function is balanced or strongly balanced if its sign matrix
representation is.

Theorem 7. Let $f : \{-1,+1\}^n \to \{-1,+1\}$ be an arbitrary function, and let $g$ be a strongly
balanced function. Then

$$\mathrm{rk}(M_{f \circ g^n}) = \sum_{T \subseteq [n],\ \hat{f}_T \neq 0} \mathrm{rk}(M_g)^{|T|}.$$

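Proof. Let us write out the sign matrix for $\chi_T \circ g^n$ explicitly. If we let $M^0 = J$ be the
all ones matrix and $M^1 = M_g$, then we can nicely write the sign matrix representing
$\chi_T(g(x^1,y^1),\ldots,g(x^n,y^n))$ as

$$M_{\chi_T \circ g^n} = \bigotimes_{i=1}^n M^{T[i]},$$

where $T[i] = 1$ if $i \in T$ and $0$ otherwise.

We see that the condition on $g$ implies $M_{\chi_T \circ g^n} M_{\chi_S \circ g^n}^\dagger = 0$ if $S \neq T$. Indeed,

$$M_{\chi_T \circ g^n} M_{\chi_S \circ g^n}^\dagger = \Big(\bigotimes_i M^{T[i]}\Big)\Big(\bigotimes_i M^{S[i]}\Big)^\dagger = \bigotimes_i \Big(M^{T[i]} (M^{S[i]})^\dagger\Big) = 0.$$

This follows since, by the assumption $S \neq T$, there is some $i$ for which $S[i] \neq T[i]$, which
means that this term is either $M_g J^\dagger = 0$ or $J M_g^\dagger = 0$ because $g$ is strongly balanced. The
other case follows similarly.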

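Now that we have established this property, we can use Fact 1 to obtain

$$\mathrm{rk}(M_{f \circ g^n}) = \mathrm{rk}\Big(\sum_{T \subseteq [n]} \hat{f}_T\, \chi_T(g(x^1,y^1),\ldots,g(x^n,y^n))\Big) = \sum_{T \subseteq [n],\ \hat{f}_T \neq 0} \mathrm{rk}(M_{\chi_T \circ g^n}) = \sum_{T \subseteq [n],\ \hat{f}_T \neq 0} \mathrm{rk}(M_g)^{|T|}.$$

In the last step we used the fact that rank is multiplicative under tensor product.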
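
The rank formula can be checked on a small instance. The sketch below computes exact ranks over the rationals and compares them with the sum over nonzero Fourier coefficients, for a 4-by-4 strongly balanced sign matrix of rank two and $n = 2$. All function names are illustrative assumptions, and the choice of inner matrix and outer functions is just one convenient test case.

```python
from fractions import Fraction
from itertools import combinations, product
from math import prod

def rank(M):
    """Exact rank by Gaussian elimination over the rationals."""
    A = [[Fraction(v) for v in row] for row in M]
    r = 0
    for c in range(len(A[0])):
        pivot = next((i for i in range(r, len(A)) if A[i][c] != 0), None)
        if pivot is None:
            continue
        A[r], A[pivot] = A[pivot], A[r]
        for i in range(r + 1, len(A)):
            factor = A[i][c] / A[r][c]
            if factor != 0:
                A[i] = [a - factor * b for a, b in zip(A[i], A[r])]
        r += 1
        if r == len(A):
            break
    return r

def composed_matrix(f, Mg, n):
    """Sign matrix of f composed with n copies of the function whose sign matrix is Mg."""
    row_idx = list(product(range(len(Mg)), repeat=n))
    col_idx = list(product(range(len(Mg[0])), repeat=n))
    return [[f(tuple(Mg[a][b] for a, b in zip(x, y))) for y in col_idx]
            for x in row_idx]

def rank_formula(f, Mg, n):
    """Sum of rk(Mg)^|T| over subsets T whose Fourier coefficient of f is nonzero."""
    points = list(product((-1, 1), repeat=n))
    rg = rank(Mg)
    total = 0
    for size in range(n + 1):
        for T in combinations(range(n), size):
            if sum(f(z) * prod(z[i] for i in T) for z in points) != 0:
                total += rg ** size
    return total

# A strongly balanced 4-by-4 sign matrix of rank two (test input).
inner = [[1, -1, 1, -1], [1, -1, -1, 1], [-1, 1, 1, -1], [-1, 1, -1, 1]]

def or_sign(z):
    return -1 if any(zi == -1 for zi in z) else 1

def parity(z):
    return prod(z)

rank_or = rank(composed_matrix(or_sign, inner, 2))
rank_parity = rank(composed_matrix(parity, inner, 2))
```

For OR on two bits all four Fourier coefficients are nonzero, so the formula gives $1 + 2 + 2 + 4 = 9$; for PARITY only the top coefficient survives, giving $2^2 = 4$.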

Theorem 7 has the following implication for the log rank conjecture of the composed 
function with the assumption of the same conjecture for the inner function. 

Corollary 2. Let $X, Y$ be finite sets, $g : X \times Y \to \{-1,+1\}$ be a strongly balanced function,
and $M_g[x,y] = g(x,y)$ be the corresponding sign matrix. Let $f : \{-1,+1\}^n \to \{-1,+1\}$ be
an arbitrary function. Assume that $\mathrm{rk}(M_g) \ge 2$ and further suppose that there is a constant
$c$ such that $D(g) \le (\log \mathrm{rk}(M_g))^c$. Then

$$D(f \circ g^n) = O\big((\log \mathrm{rk}(M_{f \circ g^n}))^{4+c}\big).$$

Proof. By Corollary 1, $D(f \circ g^n) = O(\deg(f)^4 D(g)) = O(\deg(f)^4 (\log \mathrm{rk}(M_g))^c)$. Now, it
follows from Theorem 7 that $\log \mathrm{rk}(M_{f \circ g^n}) \ge \deg(f) \log \mathrm{rk}(M_g)$, as by definition of degree there
is some $T \subseteq [n]$ with $|T| = \deg(f)$ and $\hat{f}_T \neq 0$. Since $\log \mathrm{rk}(M_g) \ge 1$, the claim follows.

In particular, this corollary means that whenever $g$ is a strongly balanced function on a
constant number of bits and $\mathrm{rk}(M_g) > 1$, then the log rank conjecture holds for $f \circ g^n$. If $g$ is
strongly balanced and $\mathrm{rk}(M_g) = 1$ then, up to permutation of rows and columns, which does
not change the communication complexity, $M_g$ is a tensor product of the XOR function with
an all ones matrix. The log rank conjecture in the case $(f \circ \oplus^n)(x,y) = f(x_1 \oplus y_1, \ldots, x_n \oplus y_n)$
remains an interesting open question. Shi and Zhang [SZ09a] have recently resolved this
question when $f$ is symmetric.

4 A bound in terms of approximate degree 

In this section, we will address the frameworks of Sherstov and Shi-Zhu. We extend both of
these frameworks to give more general conditions on the inner function $g$ which still imply
that the approximate degree of $f$ is a lower bound on the quantum communication complexity of
the composed function $f \circ g^n$. In outline, both of these frameworks follow the same plan.
By Theorem 4 it suffices to lower bound the approximate $\gamma_2$ norm (or even approximate
trace norm) of $f \circ g^n$. To do this, they use the dual formulation given by Proposition 1 and
construct a witness matrix $B$ which has non-negligible correlation with the target function
and small $\gamma_2^*$ (or spectral) norm.

A very nice way to construct this witness, used by both Sherstov and Shi-Zhu, is to use the
dual polynomial of $f$. This is a polynomial $v$ which certifies that the approximate polynomial
degree of $f$ is at least a certain value. More precisely, duality theory of linear programming
gives the following lemma.

Lemma 1 (Sherstov [She09], Shi-Zhu [SZ09b]). Let $f : \{-1,+1\}^n \to \{-1,+1\}$ and let
$d = \deg_\epsilon(f)$. Then there exists a function $v : \{-1,+1\}^n \to \mathbb{R}$ such that

1. $\langle v, \chi_T \rangle = 0$ for every character $\chi_T$ with $|T| < d$.

2. $\|v\|_1 = 1$.

3. $\langle v, f \rangle \ge \epsilon$.

Items (2), (3) are used to lower bound the correlation of the witness matrix with the target
matrix and to upper bound the $\ell_1$ norm of the witness matrix. In the most difficult step, and
where these works diverge, Item (1) is used to upper bound the $\gamma_2^*$ (or spectral) norm of the
witness matrix.

We treat each of these frameworks separately in the next two sections. 



4.1 Sherstov's framework 



The proof of the next theorem follows the same steps as Sherstov's proof for pattern matrices 
(Theorem 5.1 [She09]). Our main contribution is to identify the strongly balanced condition 
as the key property of pattern matrices which enables the proof to work. 

Theorem 8. Let $X, Y$ be finite sets, $g : X \times Y \to \{-1,+1\}$ be a strongly balanced function,
and $M_g[x,y] = g(x,y)$ be the corresponding sign matrix. Let $f : \{-1,+1\}^n \to \{-1,+1\}$ be
an arbitrary function. Then

$$Q^*_\epsilon(f \circ g^n) \ge \deg_{\epsilon_0}(f) \log_2\left(\frac{\sqrt{\mathrm{size}(M_g)}}{\|M_g\|}\right) - O(1)$$
for any $\epsilon > 0$ and $\epsilon_0 > 2\epsilon$.

In particular, this result means that the quantum and randomized complexities of $f \circ g^n$
are polynomially related whenever $g$ is strongly balanced and
$\log\left(\sqrt{\mathrm{size}(M_g)}/\|M_g\|\right)$ is polynomially
related to the randomized communication complexity of $g$. While the complexity measure of
$g$ used here may look strange at first, Shaltiel [Sha03] has shown that it is closely related
to the discrepancy of $g$ under the uniform distribution, as noted above in Theorem 2. This
theorem strictly generalizes the case of pattern matrices, but it could still be the case that
the results of Shi-Zhu can show bounds not possible with this theorem.

Proof (Proof of Theorem 8). Let $d = \deg_{\epsilon_0}(f)$ and let $v$ be a dual polynomial for $f$ with
properties as in Lemma 1. We define a witness matrix as

$$B[x,y] = \frac{2^n}{\mathrm{size}(M_g)^n}\, v(g(x^1,y^1),\ldots,g(x^n,y^n)).$$

Let us first lower bound the inner product $\langle M_{f \circ g^n}, B \rangle$. Notice that as $M_g$ is strongly
balanced, it is in particular balanced, and so the number of ones (or minus ones) in $M_g$ is
$\mathrm{size}(M_g)/2$. Hence

$$\langle M_{f \circ g^n}, B \rangle = \frac{2^n}{\mathrm{size}(M_g)^n} \sum_{z \in \{-1,+1\}^n} f(z)\, v(z) \prod_{i=1}^n \Big(\sum_{x^i, y^i :\, g(x^i,y^i) = z_i} 1\Big) = \langle f, v \rangle \ge \epsilon_0.$$

A similar argument shows that $\|B\|_1 = 1$ as $\|v\|_1 = 1$.

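Now we turn to evaluate $\|B\|$. As shown above, the strongly balanced property of $g$
implies that the matrices $\chi_T \circ g^n$ and $\chi_S \circ g^n$ are orthogonal for distinct sets $S, T \subseteq [n]$.
We can thus use Fact 1 to compute as follows.

$$\|B\| = \frac{2^n}{\mathrm{size}(M_g)^n} \Big\|\sum_{T \subseteq [n]} \hat{v}_T\, M_{\chi_T \circ g^n}\Big\|$$
$$= \frac{2^n}{\mathrm{size}(M_g)^n} \max_T |\hat{v}_T|\, \|M_{\chi_T \circ g^n}\|$$
$$= \max_T 2^n |\hat{v}_T| \prod_i \frac{\|M^{T[i]}\|}{\mathrm{size}(M_g)}$$
$$\le \max_{T :\, \hat{v}_T \neq 0} \prod_i \frac{\|M^{T[i]}\|}{\mathrm{size}(M_g)}$$
$$\le \left(\frac{\|M_g\|}{\sqrt{\mathrm{size}(M_g)}}\right)^d \left(\frac{1}{\mathrm{size}(M_g)}\right)^{n/2}.$$

In the second to last step we have used that $|\hat{v}_T| \le 1/2^n$ as $\|v\|_1 = 1$, and in the last step
we have used the fact that $\|J\| = \sqrt{\mathrm{size}(M_g)}$, together with $\hat{v}_T = 0$ for $|T| < d$.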
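Now putting everything together we have

$$\frac{\|M_{f \circ g^n}\|_{tr}^{2\epsilon}}{\sqrt{\mathrm{size}(M_{f \circ g^n})}} \ge (\epsilon_0 - 2\epsilon) \left(\frac{\sqrt{\mathrm{size}(M_g)}}{\|M_g\|}\right)^d.$$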

The lower bound on quantum communication complexity now follows from Theorem 4. 

Using the theorem of Shaltiel relating discrepancy to the spectral norm (Theorem 2), we
get the following corollary:

Corollary 3. Let the quantities be defined as in Theorem 8. Then

$$Q^*_{1/8}(f \circ g^n) \ge \frac{1}{3} \deg_{1/3}(f) \left(\log\left(\frac{1}{\mathrm{disc}_U(M_g)}\right) - 7\right) - O(1).$$



Comparison to Sherstov's pattern matrix: As mentioned in [She09], Sherstov's pattern
matrix method can prove a quantum lower bound of $\Omega(\deg_\epsilon(f))$ for block composed functions
$f \circ g^n$ if the matrix $M_g$ contains the following 4×4 matrix as a submatrix:

$$S_4 = \begin{pmatrix} 1 & -1 & 1 & -1 \\ 1 & -1 & -1 & 1 \\ -1 & 1 & 1 & -1 \\ -1 & 1 & -1 & 1 \end{pmatrix}$$

In this paper we show that the same lower bound holds as long as $M_g$ contains a strongly
balanced submatrix of rank greater than one. Are there strongly balanced matrices not
containing $S_4$ as a submatrix? It turns out that the answer is yes: we give the following 6×6
matrix as one example.

$$S_6 = \begin{pmatrix} 1 & 1 & 1 & -1 & -1 & -1 \\ 1 & 1 & -1 & 1 & -1 & -1 \\ 1 & -1 & -1 & -1 & 1 & 1 \\ -1 & -1 & 1 & 1 & 1 & -1 \\ -1 & 1 & -1 & -1 & 1 & 1 \\ -1 & -1 & 1 & 1 & -1 & 1 \end{pmatrix}$$
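
The strongly balanced property of the 6×6 example is easy to confirm mechanically. In the sketch below the entries of `S6` are transcribed from the matrix above, so any transcription error would carry over; the helper name is an assumption. The check that two rows are neither equal nor negations of each other confirms that the rank exceeds one. The sketch does not test the claim about $S_4$ not appearing as a submatrix.

```python
S6 = [
    [ 1,  1,  1, -1, -1, -1],
    [ 1,  1, -1,  1, -1, -1],
    [ 1, -1, -1, -1,  1,  1],
    [-1, -1,  1,  1,  1, -1],
    [-1,  1, -1, -1,  1,  1],
    [-1, -1,  1,  1, -1,  1],
]

def is_strongly_balanced(matrix):
    """True if every row and every column of the sign matrix sums to zero."""
    rows_ok = all(sum(row) == 0 for row in matrix)
    cols_ok = all(sum(col) == 0 for col in zip(*matrix))
    return rows_ok and cols_ok

s6_balanced = is_strongly_balanced(S6)

# A rank-one sign matrix has all rows equal up to sign; rows 0 and 1 are not.
s6_rank_above_one = S6[0] != S6[1] and S6[0] != [-v for v in S6[1]]
```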



4.2 Shi-Zhu framework 

The method of Shi-Zhu does not restrict the form of the inner function g, but rather works 
for any g which is sufficiently "hard." The hardness condition they require is phrased in terms 
of a somewhat awkward measure they term spectral discrepancy. 

Definition 6 (spectral discrepancy). Let $A$ be an $m$-by-$n$ sign matrix. The spectral
discrepancy of $A$, denoted $\rho(A)$, is the smallest $r$ such that there is a submatrix $A'$ of $A$ and a
probability distribution $\mu$ on the entries of $A'$ satisfying:

1. $A'$ is balanced with respect to $\mu$, i.e. the distribution gives equal weight to $-1$ entries
and $+1$ entries of $A'$.

2. The spectral norm of $A' \bullet \mu$ is small:

$$\|A' \bullet \mu\| \le \frac{r}{\sqrt{\mathrm{size}(A')}}.$$

3. The entrywise absolute value of the matrix $A' \bullet \mu$ should also have a bound on its spectral
norm in terms of $r$:

$$\big\|\,|A' \bullet \mu|\,\big\| \le \frac{1 + r}{\sqrt{\mathrm{size}(A')}}.$$

While conditions (1), (2) in the definition of spectral discrepancy are quite natural,
condition (3) can be complicated to verify. Note that condition (3) will always be satisfied when
$\mu$ is taken to be the uniform distribution. Using this notion of spectral discrepancy, Shi-Zhu
show the following theorem.

Theorem 9 (Shi-Zhu [SZ09b]). Let $f : \{-1,+1\}^n \to \{-1,+1\}$, and $g : X \times Y \to \{-1,+1\}$.
For any $\epsilon$ and $\epsilon_0 > 2\epsilon$,

$$Q_\epsilon(f \circ g^n) \ge \Omega(\deg_{\epsilon_0}(f)),$$

provided $\rho(M_g) \le \frac{\deg_{\epsilon_0}(f)}{2en}$. Here $e = 2.718\ldots$ is Euler's number.

Chattopadhyay [Cha08] extended the technique of Shi-Zhu to the case of multiparty
communication complexity, answering an open question of Sherstov [She08a]. In doing so, he
gave a more natural condition on the hardness of $g$ in terms of an upper bound on discrepancy
frequently used in the multiparty setting and originally due to Babai, Nisan, and Szegedy
[BNS92]. As all that is crucially needed is subadditivity, we do the argument here with $\gamma_2^*$,
which is essentially equal to the discrepancy.

Theorem 10. Let $f : \{-1,+1\}^n \to \{-1,+1\}$, and $g : X \times Y \to \{-1,+1\}$. Fix $0 < \epsilon < 1/2$,
and let $\epsilon_0 > 2\epsilon$. Then

$$Q^*_\epsilon(f \circ g^n) \ge \deg_{\epsilon_0}(f) - O(1),$$

provided there is a distribution $\mu$ which is balanced with respect to $g$ and for which
$\gamma_2^*(M_g \bullet \mu) \le \frac{\deg_{\epsilon_0}(f)}{2en}$.



Proof. We again use Proposition 1, this time with the $\gamma_2$ norm instead of the trace norm.

$$\gamma_2^{\epsilon}(M_{f \circ g^n}) = \max_B \frac{\langle M_{f \circ g^n}, B \rangle - \epsilon \|B\|_1}{\gamma_2^*(B)}.$$

To prove a lower bound we choose a witness matrix $B$ as follows

$$B[x,y] = 2^n \cdot v(g(x^1,y^1),\ldots,g(x^n,y^n)) \cdot \prod_{i=1}^n \mu(x^i,y^i),$$

where $v$ witnesses that $f$ has approximate degree at least $d = \deg_{\epsilon_0}(f)$. This definition is
the same as in the previous section where $\mu$ was simply the uniform distribution. As argued
before, we have $\langle M_{f \circ g^n}, B \rangle \ge \epsilon_0$ and $\|B\|_1 = 1$ because $M_g \bullet \mu$ is balanced.
We again expand $B$ as

$$B = 2^n \sum_{T : |T| \ge d} \hat{v}_T \bigotimes_{i=1}^n (M_g \bullet \mu)^{T[i]},$$

where $(M_g \bullet \mu)^1 = M_g \bullet \mu$ and $(M_g \bullet \mu)^0 = \mu$.

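Now comes the difference with the previous proof. As we do not have special knowledge
of the function $g$, we simply bound $\gamma_2^*(B)$ using the triangle inequality.

$$\gamma_2^*(B) \le 2^n \sum_{T : |T| \ge d} |\hat{v}_T|\, \gamma_2^*\Big(\bigotimes_{i=1}^n (M_g \bullet \mu)^{T[i]}\Big)$$
$$= 2^n \sum_{T : |T| \ge d} |\hat{v}_T|\, \gamma_2^*(M_g \bullet \mu)^{|T|}\, \gamma_2^*(\mu)^{n - |T|}$$
$$\le \sum_{T : |T| \ge d} \gamma_2^*(M_g \bullet \mu)^{|T|},$$

where in the last step we have used that $\gamma_2^*(\mu) \le 1$ as $\mu$ is a probability distribution and
that $|\hat{v}_T| \le 2^{-n}$. In the second step (equality) we used the fact that $\gamma_2^*$ is multiplicative with
respect to tensor product, a property proved in [LSS08]. We continue with simple arithmetic:

$$\gamma_2^*(B) \le \sum_{i=d}^n \binom{n}{i} \gamma_2^*(M_g \bullet \mu)^i \le \sum_{i=d}^n \left(\frac{e n\, \gamma_2^*(M_g \bullet \mu)}{d}\right)^i \le 2^{-d}$$

provided that $\gamma_2^*(M_g \bullet \mu) \le \frac{d}{2en}$.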



5 A general framework for functions composed through a group 

In this section we begin the study of more general function composition through a group $G$. In
this case the outer function $f : G \to \{-1,+1\}$ is a class function, i.e. invariant on conjugacy
classes, and the inner function $g : X \times Y \to G$ has range $G$. We define the composed function
as $F(x,y) = f(g(x,y))$. In previous sections of the paper we have just dealt with the case
$G = \mathbb{Z}_2^n$.



Let us recall the basic idea of the proof of Theorem 8. To prove a lower bound on the
quantum communication complexity for a composed function $f \circ g$, we constructed a witness
matrix $B$ which had non-negligible correlation with $f \circ g$ and small spectral norm. To do
this, following the work of Sherstov and Shi-Zhu [She09,SZ09b], we considered the dual
polynomial $p$ of $f$ using LP duality. The dual polynomial has two important properties: first,
that $p$ has non-negligible correlation with $f$, and second, that $p$ has no support on low degree
polynomials. We can then use the first property to show that the composed function $p \circ g$
will give non-negligible inner product with $f \circ g$ and the second to upper bound the spectral
norm of $p \circ g$. The second of these tasks is the more difficult. In the case of $G = \{-1,+1\}^n$,
the degree of a character $\chi_T$ is a natural measure of how "hard" the character is: the larger
$T$ is, the smaller the spectral norm of $\chi_T \circ g$ will be. In the general group case, however, it is
less clear what the corresponding "hard" and "easy" characters should be. In Section 5.1, we
will show that this framework actually works for an arbitrary partition of the basis functions
into Easy and Hard. That is, for any partition of the basis functions into Easy and
Hard sets we can follow the plan outlined above and look for a function with support on the
Hard set which has non-negligible correlation with $f$.

In carrying out this plan, one is still left with upper bounding $\|M_{p \circ g}\|$. Here, as in the
Boolean case, it is again very convenient to have an orthogonality condition which can
greatly simplify the computation of $\|M_{p \circ g}\|$ and give good bounds. In the Boolean case we
have shown that $M_g$ being strongly balanced implies this key orthogonality condition. In
Sections 5.2 and 5.3, we will show that for a general group, the condition is not only about
each row and column of the matrix $M_g$, but about all pairs of rows and all pairs of columns.
In the Abelian group case, this reduces to a nice group invariance condition.

Even after applying the orthogonality condition to use the maximum bound instead of the
triangle inequality for $\|M_{p \circ g}\|$, the remaining term $\|M_{\chi_i \circ g}\|$ (where $\chi_i$ is a "hard" character) is
still not easy to upper bound. For block composed functions, fortunately, the tensor structure
makes it feasible to compute. Section 5.4 gives a generalized version of Theorem 8.



5.1 General framework 

For a multiset $T$, $x \in T$ means $x$ running over $T$. Thus $T = \{a(s) : s \in S\}$ means the multiset
formed by collecting $a(s)$ with $s$ running over $S$.

For a set $S$, denote by $L_{\mathbb{C}}(S)$ the $|S|$-dimensional vector space over the field $\mathbb{C}$ (of complex
numbers) consisting of all functions from $S$ to $\mathbb{C}$, endowed with the inner product
$\langle \psi, \phi \rangle = \frac{1}{|S|} \sum_{s \in S} \overline{\psi(s)}\,\phi(s)$. The distance of a function $f \in L_{\mathbb{C}}(S)$ to a subspace $\Phi$ of $L_{\mathbb{C}}(S)$,
denoted by $d(f, \Phi)$, is defined as $\min\{\delta : \|f' - f\|_\infty \le \delta,\ f' \in \Phi\}$, i.e. the magnitude of the least
entrywise perturbation needed to turn $f$ into an element of $\Phi$.

In the above setting, Theorem 8 generalizes to the following. 

Theorem 11. Consider a sign matrix $A = [f(g(x,y))]_{x,y}$ where $g : X \times Y \to S$ for a set $S$,
and $f : S \to \{-1,+1\}$. Suppose that we can find an orthogonal basis of functions $\Psi = \{\psi_i : i \in [|S|]\}$
for $L_{\mathbb{C}}(S)$. For any hardness partition $\Psi = \Psi_{Hard} \uplus \Psi_{Easy}$, let $\delta = d(f, \mathrm{span}(\Psi_{Easy}))$.

If

1. (regularity) The multiset $\{g(x,y) : x \in X, y \in Y\}$ is a multiple of $S$, i.e. $S$ repeated
some number of times.

2. (orthogonality) for all $x, x', y, y'$ and all distinct $\psi_i, \psi_j \in \Psi_{Hard}$,

$$\sum_y \psi_i(g(x,y))\,\overline{\psi_j(g(x',y))} = \sum_x \overline{\psi_i(g(x,y))}\,\psi_j(g(x,y')) = 0,$$

then



Q e (A) > log 2 V -^-^ - O 1 . 

max^ efl / Hor<J (max g \ipi{g)\ ■ \\m{g{x,y))]x,v\\) 



Using the idea of finding a certificate of the high approximate degree by duality [She09,SZ09b],
we have the following fact analogous to Lemma 1.

Lemma 2. For a function $f : S \to \mathbb{C}$ and a subspace $\Phi$ of $L_{\mathbb{C}}(S)$, if $d(f, \mathrm{span}(\Phi)) = \delta$, then
there exists a function $h$ s.t.

$$\hat{h}_i = 0, \quad \forall \psi_i \in \Phi \tag{1}$$

$$\sum_{g \in G} |h(g)| \le 1, \tag{2}$$

$$\Big|\sum_{g \in G} f(g) h(g)\Big| \ge \delta \tag{3}$$

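Using the lemma, we can prove Theorem 11.

Proof (of Theorem 11). By the regularity property, we know that when $(x,y)$ runs over $X \times Y$,
$g(x,y)$ runs over $S$ exactly $K$ times, where $K = MN/|G|$. Consider $B = \frac{1}{K}[h(g(x,y))]_{x,y}$, with $h$
obtained by Lemma 2; we want to apply Proposition 1 and Theorem 4 by using this $B$. First,

$$\|B\|_1 = \frac{1}{K} \sum_{x,y} |h(g(x,y))| = \sum_{g \in G} |h(g)| \le 1. \tag{4}$$

Also,

$$|\langle A, B \rangle| = \frac{1}{K} \Big|\sum_{x,y} f(g(x,y))\,h(g(x,y))\Big| = \Big|\sum_{g \in G} f(g) h(g)\Big| \ge \delta.$$

Now we need to compute

$$\|B\| = \frac{1}{K} \Big\| \Big[\sum_i \hat{h}_i \psi_i(g(x,y))\Big]_{x,y} \Big\|.$$

Note that

$$[\psi_i(g(x,y))]_{x,y}^{\dagger}\,[\psi_j(g(x,y))]_{x,y} = \Big[\sum_x \overline{\psi_i(g(x,y))}\,\psi_j(g(x,y'))\Big]_{y,y'}$$

and

$$[\psi_i(g(x,y))]_{x,y}\,[\psi_j(g(x,y))]_{x,y}^{\dagger} = \Big[\sum_y \psi_i(g(x,y))\,\overline{\psi_j(g(x',y))}\Big]_{x,x'}.$$

Thus the orthogonality condition implies that

$$[\psi_i(g(x,y))]_{x,y}^{\dagger}\,[\psi_j(g(x,y))]_{x,y} = [\psi_i(g(x,y))]_{x,y}\,[\psi_j(g(x,y))]_{x,y}^{\dagger} = 0$$

for all $i \ne j$. Now as in [She09], we can use the max bound

$$\|B\| = \frac{1}{K} \max_i \big\| \hat{h}_i [\psi_i(g(x,y))]_{x,y} \big\|$$
$$= \frac{1}{K} \max_{i : \psi_i \in Hard} |\hat{h}_i| \cdot \big\| [\psi_i(g(x,y))]_{x,y} \big\|$$
$$\le \frac{1}{K|G|} \max_{i : \psi_i \in Hard} \Big( \max_g |\psi_i(g)| \cdot \big\| [\psi_i(g(x,y))]_{x,y} \big\| \Big),$$

where the last inequality is due to Eq. (5) and Eq. (4). Finally note $K|G| = MN$ to complete
the proof.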



In the Boolean block composed function case, the regularity condition reduces to the
matrix $[g(x,y)]$ being balanced, and later we will prove that the orthogonality condition
reduces to the strongly balanced property. From this theorem we can see that the way
$\Psi$ is partitioned into $\Psi_{Easy}$ and $\Psi_{Hard}$ does not affect whether the lower bound proof goes
through. However, the partition does play a role when we later bound the spectral norm in
the denominator.
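As an illustration that is not part of the original argument, the two hypotheses of Theorem 11 can be checked mechanically on a small example. The sketch below assumes $S = \mathbb{Z}_n$ with the characters $\psi_k(s) = e^{2\pi i k s/n}$ as the orthogonal basis, takes all non-trivial characters as Hard (a hypothetical choice), and compares the inner function $g(x,y) = x + y \bmod n$ with a function that ignores $y$. All names are illustrative.

```python
import cmath
from collections import Counter

def character(n, k):
    """Character psi_k of the cyclic group Z_n: s -> exp(2*pi*i*k*s/n)."""
    return lambda s: cmath.exp(2j * cmath.pi * k * s / n)

def is_regular(g, X, Y, S):
    """Regularity: the multiset {g(x,y)} is S repeated the same number of times."""
    counts = Counter(g(x, y) for x in X for y in Y)
    return set(counts) == set(S) and len(set(counts.values())) == 1

def is_orthogonal(g, X, Y, hard, tol=1e-9):
    """Orthogonality: for distinct hard psi_i, psi_j all row-pair and column-pair sums vanish."""
    for i, psi_i in enumerate(hard):
        for j, psi_j in enumerate(hard):
            if i == j:
                continue
            for x in X:
                for x2 in X:
                    s = sum(psi_i(g(x, y)) * psi_j(g(x2, y)).conjugate() for y in Y)
                    if abs(s) > tol:
                        return False
            for y in Y:
                for y2 in Y:
                    s = sum(psi_i(g(x, y)).conjugate() * psi_j(g(x, y2)) for x in X)
                    if abs(s) > tol:
                        return False
    return True

n = 5
X = Y = S = range(n)
hard = [character(n, k) for k in range(1, n)]  # assumption: every non-trivial character is Hard
add = lambda x, y: (x + y) % n                 # g(x, y) = x + y in Z_n
first = lambda x, y: x                         # g ignores y

print(is_regular(add, X, Y, S), is_orthogonal(add, X, Y, hard))
print(is_regular(first, X, Y, S), is_orthogonal(first, X, Y, hard))
```

In this toy setting the addition function satisfies both conditions, while the second function is regular but fails orthogonality, which shows that the two conditions are independent requirements on $g$.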

5.2 Functions with group symmetry 

For a general finite group $G$, two elements $s$ and $t$ are conjugate, denoted by $s \sim t$, if there
exists an element $r \in G$ s.t. $rsr^{-1} = t$. Define $H$ as the set of all class functions, i.e. functions
$f$ s.t. $f(s) = f(t)$ if $s \sim t$. Then $H$ is an $h$-dimensional subspace of $L_{\mathbb{C}}(G)$, where $h$ is the
number of conjugacy classes. The irreducible characters $\{\chi_i : i \in [h]\}$ form an orthogonal
basis of $H$. For a class function $f$ and irreducible characters $\chi_i$, denote by $\hat{f}_i$ the coefficient
of $\chi_i$ in the expansion of $f$ according to the $\chi_i$'s, i.e. $\hat{f}_i = \langle \chi_i, f \rangle = \frac{1}{|G|} \sum_{s \in G} \overline{\chi_i(s)} f(s)$. An easy
fact is that for any $i$, we have

$$|\hat{f}_i| \le \frac{1}{|G|} \max_{g \in G} |\chi_i(g)| \cdot \sum_{s \in G} |f(s)|. \tag{5}$$

If $G$ is Abelian, then it always has $|\chi_i(s)| = 1$, thus $\max_i |\hat{f}_i| \le \frac{1}{|G|} \sum_{s \in G} |f(s)|$. For general
groups, we have $|\chi_i(g)| \le \deg(\chi_i)$, where $\deg(\chi_i)$ is the degree of $\chi_i$, namely the dimension
of the associated vector space.
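To make the coefficients concrete, here is a minimal Python sketch (an illustration, not taken from the paper) for the cyclic group $\mathbb{Z}_n$. It assumes the normalization $\hat{f}_k = \frac{1}{|G|}\sum_s \overline{\chi_k(s)} f(s)$, reconstructs $f$ from its coefficients, and checks the triangle-inequality bound $|\hat{f}_k| \le \frac{1}{|G|} \max_s |\chi_k(s)| \sum_s |f(s)|$. The sign function `f` is a hypothetical example.

```python
import cmath

def characters(n):
    """All irreducible characters of Z_n, each as a list of values chi_k(s), s = 0..n-1."""
    return [[cmath.exp(2j * cmath.pi * k * s / n) for s in range(n)] for k in range(n)]

def coefficient(chi, f):
    """hat f = <chi, f> = (1/|G|) * sum_s conj(chi(s)) * f(s)."""
    return sum(c.conjugate() * v for c, v in zip(chi, f)) / len(f)

n = 6
f = [1, -1, -1, 1, 1, -1]          # hypothetical sign function on Z_6
chis = characters(n)
coeffs = [coefficient(chi, f) for chi in chis]

# Expansion f(s) = sum_k hat f_k * chi_k(s)
recon = [sum(c * chi[s] for c, chi in zip(coeffs, chis)) for s in range(n)]

# For an Abelian group |chi_k(s)| = 1, so the bound is (1/|G|) * sum_s |f(s)|
bound = sum(abs(v) for v in f) / n

print(all(abs(r - v) < 1e-9 for r, v in zip(recon, f)))
print(max(abs(c) for c in coeffs) <= bound + 1e-12)
```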

In this section we consider the setting that $S$ is a finite group $G$. The goal is to exploit
properties of group characters to give a better form of the lower bound. In particular, we hope
to see when the second condition holds and what the matrix operator norm $\|[\psi_i(g(x,y))]_{x,y}\|$
is in this setting.

The standard orthogonality of irreducible characters says that $\sum_{s \in G} \chi_i(s)\overline{\chi_j(s)} = 0$ for $i \ne j$. The
second condition in Theorem 11 is concerned with a more general case: For a multiset $T$ with
elements in $G \times G$, we need

$$\sum_{(s,t) \in T} \chi_i(s)\overline{\chi_j(t)} = 0, \quad \forall i \ne j. \tag{6}$$

The standard orthogonality relation corresponds to the special case that $T = \{(s,s) : s \in G\}$. We
hope to have a characterization of a multiset $T$ to make Eq. (6) hold.

We may think of a multiset $T$ with elements in a set $S$ as a function on $S$, with the
value on $s \in S$ being the multiplicity of $s$ in $T$. Since characters are class functions, for each
pair $(C_k, C_l)$ of conjugacy classes, only the value $\sum_{s' \in C_k, t' \in C_l} T(s',t')$ matters for the sake of
Eq. (6). We thus make $T$ a class function by taking the average within each class pair $(C_k, C_l)$.
That is, define a new function $T'$ as

$$T'(s,t) = \sum_{s' \in C_k,\, t' \in C_l} T(s',t') / (|C_k||C_l|), \quad \forall s \in C_k,\ \forall t \in C_l.$$

Proposition 2. For a finite group $G$ and a multiset $T$ with elements in $G \times G$, the following
three statements are equivalent:

1. $\sum_{(s,t) \in T} \chi_i(s)\overline{\chi_j(t)} = 0$, $\forall i \ne j$.

2. $T'$, as a function, is in $\mathrm{span}\{\chi_i \otimes \overline{\chi_i} : i \in [h]\}$.

3. $[T'(s,t)]_{s,t} = C^{T} D \overline{C}$ where $D$ is a diagonal matrix and $C = [\chi_i(s)]_{i,s}$. That is, $T'$, as a
matrix, is normal and diagonalized exactly by the irreducible characters.



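Proof. Let $H_2$ be the subspace consisting of functions $f : G \times G \to \mathbb{C}$ s.t. $f(s,t) = f(s',t')$ if
$s \sim s'$, $t \sim t'$. Note that for the direct product group $G \times G$, $\{\chi_i \otimes \overline{\chi_j} : i, j\}$ form an orthogonal
basis of $H_2$:

$$\langle \chi_i \otimes \overline{\chi_j},\ \chi_{i'} \otimes \overline{\chi_{j'}} \rangle = \langle \chi_i, \chi_{i'} \rangle \cdot \langle \overline{\chi_j}, \overline{\chi_{j'}} \rangle = 0$$

unless $i = i'$ and $j = j'$. Note that by viewing $T$ as a function from $G \times G$ to $\mathbb{C}$, Eq. (6)
and the definition of $T'$ imply that

$$\sum_{s,t \in G} T'(s,t)\,\chi_i(s)\overline{\chi_j(t)} = \sum_{(s,t) \in T} \chi_i(s)\overline{\chi_j(t)} = 0, \quad \forall i \ne j,$$

that is, $T'$ is orthogonal to every basis function $\chi_i \otimes \overline{\chi_j}$ with $i \ne j$.
Thus the first two statements are equivalent.
Note that

$$T' \in \mathrm{span}\{\chi_i \otimes \overline{\chi_i} : i \in [h]\} \iff T'(s,t) = \sum_i \alpha_i \chi_i(s)\overline{\chi_i(t)} \text{ for some } \alpha_i\text{'s}.$$

Denote by $C_{h \times |G|} = [\chi_i(g)]_{i,g}$ the matrix of the character table. Then observe that the
summation in the last equality is nothing but the $(s,t)$ entry of the matrix $C^{T} \mathrm{diag}(\alpha_1, \ldots, \alpha_h) \overline{C}$.
Therefore the equivalence of the second and third statements follows.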

5.3 Abelian group 

When $G$ is Abelian, we have further properties to use. The first one is that $|\chi_i(g)| = 1$ for
all $i$ and all $g \in G$. The second one is that the irreducible characters are homomorphisms of $G$; that is,
$\chi_i(st) = \chi_i(s)\chi_i(t)$. This gives a clean characterization of the orthogonality condition by
group invariance. For a multiset $T$, denote by $sT$ another multiset obtained by collecting all
$st$ where $t$ runs over $T$. A multiset $T$ with elements in $G \times G$ is $G$ invariant if it satisfies
$(g,g)T = T$ for all $g \in G$. We can also call a function $T : G \times G \to \mathbb{C}$ $G$ invariant if
$T(s,t) = T(rs,rt)$ for all $r, s, t \in G$. The overloading of the name is consistent when we view
a multiset $T$ as a function (counting the multiplicity of elements).

Proposition 3. For a finite Abelian group $G$ and a multiset $T$ with elements in $G \times G$,

$$T \text{ is } G \text{ invariant} \iff \sum_{(s,t) \in T} \chi_i(s)\overline{\chi_j(t)} = 0, \quad \forall i \ne j. \tag{7}$$

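Proof. $\Rightarrow$: Since $T$ is $G$ invariant, $T = (r,r)T$ and thus,

$$\sum_{(s,t) \in T} \chi_i(s)\overline{\chi_j(t)} = \sum_{(s,t) \in (r,r)T} \chi_i(s)\overline{\chi_j(t)} \tag{8}$$

$$= \sum_{(s',t') \in T} \chi_i(rs')\overline{\chi_j(rt')} \tag{9}$$

Now using the fact that irreducible characters of Abelian groups are homomorphisms, we
have

$$\sum_{(s',t') \in T} \chi_i(rs')\overline{\chi_j(rt')} = \sum_{(s',t') \in T} \chi_i(r)\chi_i(s')\overline{\chi_j(r)}\,\overline{\chi_j(t')}$$

$$= \chi_i(r)\overline{\chi_j(r)} \Big( \sum_{(s',t') \in T} \chi_i(s')\overline{\chi_j(t')} \Big).$$

But note that this holds for any $r \in G$, thus also for the average of them. That is,

$$\sum_{(s,t) \in T} \chi_i(s)\overline{\chi_j(t)} = \frac{1}{|G|} \Big( \sum_{r \in G} \chi_i(r)\overline{\chi_j(r)} \Big) \Big( \sum_{(s',t') \in T} \chi_i(s')\overline{\chi_j(t')} \Big) = 0,$$

by the standard orthogonality property of different irreducible characters.

$\Leftarrow$: Since $\sum_{(s,t) \in T} \chi_i(s)\overline{\chi_j(t)} = 0$, $\forall i \ne j$, we know that $T$ as a function is in
$\mathrm{span}\{\chi_i \otimes \overline{\chi_i} : i\}$. Note that any linear combination of $G$ invariant functions is also $G$ invariant.
Thus it remains to check that each basis function $\chi_i \otimes \overline{\chi_i}$ is $G$ invariant, which is easy to see:

$$\chi_i(rs)\overline{\chi_i(rt)} = \chi_i(r)\chi_i(s)\overline{\chi_i(r)}\,\overline{\chi_i(t)} = \chi_i(s)\overline{\chi_i(t)}.$$

This finishes the proof.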
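The equivalence in Proposition 3 can be sanity-checked numerically on small cyclic groups. The following Python sketch is only an illustration under the assumption $G = \mathbb{Z}_n$ written additively, with characters $\chi_k(s) = e^{2\pi i k s/n}$; the multisets `orbit` and `lopsided` are hypothetical examples, and the final check enumerates all multisets of three pairs over $\mathbb{Z}_3 \times \mathbb{Z}_3$.

```python
import cmath
from collections import Counter
from itertools import combinations_with_replacement

def chi(n, k, s):
    """Character chi_k of Z_n evaluated at s."""
    return cmath.exp(2j * cmath.pi * k * s / n)

def is_invariant(T, n):
    """G invariance: (r, r)T = T as multisets for every r in Z_n."""
    base = Counter(T)
    return all(Counter(((r + s) % n, (r + t) % n) for s, t in T) == base
               for r in range(n))

def cross_sums_vanish(T, n, tol=1e-9):
    """Check sum_{(s,t) in T} chi_i(s) * conj(chi_j(t)) = 0 for all i != j."""
    return all(
        abs(sum(chi(n, i, s) * chi(n, j, t).conjugate() for s, t in T)) < tol
        for i in range(n) for j in range(n) if i != j
    )

n = 4
orbit = [((r + 1) % n, (r + 3) % n) for r in range(n)]  # diagonal orbit of (1, 3)
lopsided = [(0, 0), (0, 0), (1, 2)]                     # not closed under diagonal shifts

print(is_invariant(orbit, n), cross_sums_vanish(orbit, n))
print(is_invariant(lopsided, n), cross_sums_vanish(lopsided, n))

# Exhaustive comparison over all multisets of size 3 in Z_3 x Z_3
pairs = [(s, t) for s in range(3) for t in range(3)]
agree = all(is_invariant(T, 3) == cross_sums_vanish(T, 3)
            for T in combinations_with_replacement(pairs, 3))
print(agree)
```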

Another nice property of Abelian groups is that the orthogonality condition
implies the regularity one. Here $S^{x,x'} = \{(g(x,y), g(x',y)) : y \in Y\}$ and
$T^{y,y'} = \{(g(x,y), g(x,y')) : x \in X\}$.

Proposition 4. For an Abelian group $G$, if either $T^{y,y}$ is $G$ invariant for all $y$ or $S^{x,x}$ is $G$
invariant for all $x$, then $G \mid \{g(x,y) : x \in X, y \in Y\}$, i.e. this multiset is a multiple of $G$.

Proof. Note that $T^{y,y}(s,s) = |\{x : g(x,y) = s\}|$, thus $T^{y,y}$ being $G$ invariant implies that
$|\{x : g(x,y) = s\}| = |\{x : g(x,y) = t\}|$ for all $s, t \in G$. Thus the column $y$ in the matrix
$[g(x,y)]_{x,y}$, when viewed as a multiset, is equal to $G$ repeated $|X|/|G|$ times. Therefore the
whole multiset $\{g(x,y) : x \in X, y \in Y\}$ is a multiple of $G$ as well.
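A small Python sketch of this implication, given purely as an illustration: it assumes $G = \mathbb{Z}_n$ written additively and uses a hypothetical inner function with $|X| = 2|G|$. It builds the diagonal multisets $T^{y,y}$, tests their invariance, and checks that each column hits every group element equally often.

```python
from collections import Counter

def diagonal_multiset(g, X, y):
    """T^{y,y} = {(g(x,y), g(x,y)) : x in X}, as a list of pairs."""
    return [(g(x, y), g(x, y)) for x in X]

def is_invariant(T, n):
    """G invariance under the diagonal action of Z_n."""
    base = Counter(T)
    return all(Counter(((r + s) % n, (r + t) % n) for s, t in T) == base
               for r in range(n))

def column_is_multiple_of_group(g, X, y, n):
    """Column y of [g(x,y)] contains every element of Z_n equally often."""
    counts = Counter(g(x, y) for x in X)
    return len(counts) == n and len(set(counts.values())) == 1

n = 3
X, Y = range(6), range(3)
g = lambda x, y: (x + 2 * y) % n      # hypothetical inner function
const = lambda x, y: 0                # constant function, for contrast

print(all(is_invariant(diagonal_multiset(g, X, y), n) for y in Y))
print(all(column_is_multiple_of_group(g, X, y, n) for y in Y))
print(any(is_invariant(diagonal_multiset(const, X, y), n) for y in Y))
```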

What we finally get for Abelian groups is the following. 

Corollary 4. For a sign matrix $A = [f(g(x,y))]_{x,y}$ and an Abelian group $G$, if $d(f, \mathrm{span}(Ch_{Easy})) = \Omega(1)$,
and the multisets $S^{x,x'} = \{(g(x,y), g(x',y)) : y \in Y\}$ and $T^{y,y'} = \{(g(x,y), g(x,y')) : x \in X\}$
are $G$ invariant for any $(x,x')$ and any $(y,y')$, then

$$Q(A) \ge \log_2 \frac{\sqrt{MN}}{\max_{i \in Hard} \|[\chi_i(g(x,y))]_{x,y}\|} - O(1).$$

5.4 Block composed functions 

We now consider a special class of functions $g$: block composed functions. Suppose the group
$G$ is a product group $G = G_1 \times \cdots \times G_t$, and $g(x,y) = (g_1(x^1,y^1), \ldots, g_t(x^t,y^t))$ where
$x = (x^1, \ldots, x^t)$ and $y = (y^1, \ldots, y^t)$. That is, both $x$ and $y$ are decomposed into $t$ components
and the $i$-th coordinate of $g(x,y)$ only depends on the $i$-th components of $x$ and $y$. The
tensor structure makes all the computation easy. Theorem 8 can be generalized to the general
product group case for arbitrary groups $G_i$.

Definition 7. The $\epsilon$-approximate degree of a class function $f$ on the product group $G_1 \times \cdots \times G_t$,
denoted by $\deg_\epsilon(f)$, is the minimum $d$ s.t. $\|f - f'\|_\infty \le \epsilon$, where $f'$ can be represented as a linear
combination of irreducible characters with at most $d$ non-identity component characters.

Theorem 12. For sign matrix

$$A = [f(g_1(x^1,y^1), \ldots, g_t(x^t,y^t))]_{x,y}$$

where all $g_i$ satisfy their orthogonality conditions, we have

$$Q(A) \ge \min_{\{\chi_i\}, S} \sum_{i \in S} \log_2 \frac{\sqrt{\mathrm{size}(M_{g_i})}}{\deg(\chi_i)\,\|M_{\chi_i \circ g_i}\|} - O(1)$$

where the minimum is over all $S \subseteq [t]$ with $|S| > \deg_{1/3}(f)$, and all non-identity irreducible
characters $\chi_i$ of $G_i$.



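Proof. Recall that an irreducible character $\chi$ of $G$ is the tensor product of irreducible
characters $\chi_i$ of each component group $G_i$. Let Hard be the set of irreducible characters $\chi$ with
more than $d$ non-identity component characters. Fix a hard character $\chi$, and denote by $S$
the set of coordinates of its non-identity characters. Writing $J_i$ for the all-ones matrix with
the same dimensions as $M_{g_i}$, we have

$$\|[\chi(g(x,y))]_{x,y}\| = \Big\| \bigotimes_{i \in [t]} [\chi_i(g_i(x^i,y^i))] \Big\|$$

$$= \prod_{i \in [t]} \|[\chi_i(g_i(x^i,y^i))]\|$$

$$= \prod_{i \in S} \|[\chi_i(g_i(x^i,y^i))]\| \times \prod_{i \notin S} \|J_i\|$$

$$= \prod_{i \in S} \|M_{\chi_i \circ g_i}\| \times \prod_{i \notin S} \sqrt{\mathrm{size}(M_{g_i})}.$$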



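Thus by Theorem 11 and Eq. (5), we have

$$Q(A) \ge \min_{\chi \in Hard} \sum_{i \in S} \log_2 \frac{\sqrt{\mathrm{size}(M_{g_i})}}{\deg(\chi_i)\,\|M_{\chi_i \circ g_i}\|} - O(1), \tag{10}$$

proving the theorem.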
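The norm computation in this proof rests on two facts: the spectral norm is multiplicative under tensor (Kronecker) products, and an all-ones $m \times n$ matrix has spectral norm $\sqrt{mn}$. The following pure-Python sketch illustrates both on small sign matrices. It is not from the paper: the matrices `M1`, `M2`, the power-iteration routine and its start vector are assumptions chosen for the example. For $G_i = \mathbb{Z}_2$ in the $\pm 1$ representation, the non-identity character composed with $g_i$ is just $M_{g_i}$, while the identity character yields the all-ones matrix.

```python
import math

def kron(A, B):
    """Kronecker (tensor) product of two matrices given as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]

def spectral_norm(A, iters=100):
    """Largest singular value of a real matrix via power iteration on A^T A."""
    n = len(A[0])
    v = [1.0 / (j + 1) for j in range(n)]   # start vector; an arbitrary choice for this sketch
    for _ in range(iters):
        Av = [sum(a * x for a, x in zip(row, v)) for row in A]
        w = [sum(A[i][j] * Av[i] for i in range(len(A))) for j in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    Av = [sum(a * x for a, x in zip(row, v)) for row in A]
    return math.sqrt(sum(x * x for x in Av))

M1 = [[1, -1], [-1, 1]]                       # strongly balanced 2 x 2 sign matrix
M2 = [[1, 1, -1, -1], [1, -1, -1, 1],
      [-1, -1, 1, 1], [-1, 1, 1, -1]]         # strongly balanced 4 x 4 sign matrix
J = [[1, 1, 1], [1, 1, 1]]                    # all-ones block (identity character)

lhs = spectral_norm(kron(kron(M1, M2), J))
rhs = spectral_norm(M1) * spectral_norm(M2) * math.sqrt(2 * 3)
print(abs(lhs - rhs) < 1e-9)
```

Power iteration is used here only to keep the sketch free of external libraries; any routine that returns the largest singular value would serve equally well.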



Previous sections as well as [SZ09b] consider the case where all $g_i$'s are the same and all
$G_i$'s are $\mathbb{Z}_2$. In this case, the above bound is equal to the one in Theorem 8, and the following
proposition says that the group invariance condition degenerates to the strongly balanced
property.

Proposition 5. For $G = \mathbb{Z}_2^{\times t}$, the following two conditions for $g = (g_1, \ldots, g_t)$ are
equivalent:

1. The multisets $S^{x,x'} = \{(g(x,y), g(x',y)) : y \in Y\}$ and $T^{y,y'} = \{(g(x,y), g(x,y')) : x \in X\}$
are $G$ invariant for any $(x,x')$ and any $(y,y')$.

2. Each matrix $[g_i(x^i,y^i)]_{x^i,y^i}$ is strongly balanced.

Proof. $1 \Rightarrow 2$: $S^{x,x'}$ being $G$ invariant implies that for all $\{z_i\}$, $\{u_i\}$, $\{v_i\}$,

$$|\{y : z_i g_i(x^i,y^i) = u_i,\ z_i g_i(x'^i,y^i) = v_i,\ \forall i\}|$$
$$= |\{y : g_i(x^i,y^i) = u_i,\ g_i(x'^i,y^i) = v_i,\ \forall i\}|.$$

Take $x' = x$ and $u = v$. Now for each $i$ and each row $x^i$, take $z_i = -1$ (where the group $\mathbb{Z}_2$
is represented by $\{\pm 1\}$). For all other $i' \ne i$, take $z_{i'} = 1$. This assignment will show that

$$|\{y : g_i(x^i,y^i) = -u_i\}| = |\{y : g_i(x^i,y^i) = u_i\}|.$$

That is, the row $x^i$ in the matrix $[g_i(x^i,y^i)]$ is balanced. Similarly we can show the balance for
each column.

$2 \Rightarrow 1$: It is enough to show that for each $i$ and each $z_i$, $u_i$, $v_i$,

$$|\{y^i : z_i g_i(x^i,y^i) = u_i,\ z_i g_i(x'^i,y^i) = v_i\}|$$
$$= |\{y^i : g_i(x^i,y^i) = u_i,\ g_i(x'^i,y^i) = v_i\}|.$$

First consider the case $x'^i = x^i$. If $u_i \ne v_i$ then both numbers are 0; if $u_i = v_i$ then both
numbers are $|\{y^i\}|/2$ by the balance of row $x^i$. Now assume $x'^i \ne x^i$. Denote
$a_{bb'} = |\{y^i : g_i(x^i,y^i) = b,\ g_i(x'^i,y^i) = b'\}|$; then the above requirement amounts to $a_{00} = a_{11}$ and
$a_{01} = a_{10}$. Note that we have

$$a_{00} + a_{01} = |\{y^i : g_i(x^i,y^i) = 0\}| = |\{y^i : g_i(x^i,y^i) = 1\}| = a_{10} + a_{11}$$

where the second equality is due to the balance of row $x^i$. And similarly we have $a_{00} + a_{10} =
a_{01} + a_{11}$ by the balance of row $x'^i$. Combining the two, we get $a_{00} = a_{11}$ and $a_{01} = a_{10}$ as
desired.
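For a single component ($t = 1$, $G = \mathbb{Z}_2$ represented by $\{\pm 1\}$) the equivalence can be confirmed by brute force on small matrices. The Python sketch below is an illustration only: it enumerates every $2 \times 4$ and $4 \times 2$ sign matrix and compares the strongly balanced property with $\mathbb{Z}_2$ invariance of all row-pair multisets $S^{x,x'}$ and column-pair multisets $T^{y,y'}$. The function names are hypothetical.

```python
from collections import Counter
from itertools import product

def strongly_balanced(M):
    """Every row and every column of the +/-1 matrix M sums to zero."""
    return all(sum(r) == 0 for r in M) and all(sum(c) == 0 for c in zip(*M))

def flip_invariant(pairs):
    """Z_2 invariance: multiplying both coordinates by -1 preserves the multiset."""
    return Counter(pairs) == Counter((-s, -t) for s, t in pairs)

def group_invariant(M):
    """All S^{x,x'} (pairs of rows) and T^{y,y'} (pairs of columns) are Z_2 invariant."""
    cols = list(zip(*M))
    rows_ok = all(flip_invariant(list(zip(r, r2))) for r in M for r2 in M)
    cols_ok = all(flip_invariant(list(zip(c, c2))) for c in cols for c2 in cols)
    return rows_ok and cols_ok

def all_sign_matrices(m, n):
    """Generate every m x n matrix with entries in {+1, -1}."""
    for entries in product([1, -1], repeat=m * n):
        yield [entries[i * n:(i + 1) * n] for i in range(m)]

agree = all(strongly_balanced(M) == group_invariant(M)
            for shape in [(2, 4), (4, 2)]
            for M in all_sign_matrices(*shape))
some_balanced = any(strongly_balanced(M) for M in all_sign_matrices(2, 4))
print(agree, some_balanced)
```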

It is worth noting that the conclusion does not hold if any group $G_i$ has size larger than
two. We omit the counterexamples here.
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